U.S. Application No. 09/642,771 

Amendment to the Claims:

This listing of claims will replace all prior versions, and listings, of claims in the 
application: 

Listing of Claims:

1. (currently amended) A word importance calculation method for calculating 
the importance of words contained in a document set, whereby the difference 
between the word distribution in a subset of [[whole documents which consists of]] 
every document containing a specified word and the word distribution in [[the set]] a set 
of whole documents including said subset is used to calculate the importance of the 
word. 

2. (original) A word importance calculation method, as claimed in Claim 1, 
wherein: 

said difference is determined by comparing the distance d between said 
subset and said set of whole documents with the distance d' or the estimated value 
of d' between another subset of documents which contain substantially the same 
number of words as said subset of documents and are randomly selected from said 
[[set]] subset of whole documents, and said set of whole documents. 

3. (currently amended) A word importance calculation method, as claimed in 
Claim 2, wherein: 

the distance d between the two document sets is calculated by [[using the word 
distribution in each document set, that is to say]] using the probability of occurrence of 
each word in each of said document set. 
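
The comparison recited in Claims 1 to 3 can be illustrated with a minimal sketch. The claims do not fix a particular distance measure, so the Kullback-Leibler divergence is used here purely as a stand-in, and the subtraction of d' from d is likewise an assumption about how the two distances are "compared". The function names (word_distribution, distance, importance), the tokenized-list input format, and the choice to draw the comparison subset from the whole document set are all hypothetical.

```python
import math
import random
from collections import Counter


def word_distribution(docs):
    """Probability of occurrence of each word over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def distance(subset_docs, all_docs):
    """Assumed measure: KL divergence of the subset's distribution from the whole set's.

    subset_docs must be drawn from all_docs, so every word in the subset
    has a non-zero probability in the whole set.
    """
    p = word_distribution(subset_docs)
    q = word_distribution(all_docs)
    return sum(p_w * math.log(p_w / q[w]) for w, p_w in p.items())


def importance(word, all_docs, rng=None):
    """d for the documents containing `word`, minus d' for a random subset
    holding roughly the same number of words (subtraction is an assumption)."""
    rng = rng or random.Random(0)
    subset = [doc for doc in all_docs if word in doc]
    d = distance(subset, all_docs)

    # Build the comparison subset: random documents until the word count matches.
    target = sum(len(doc) for doc in subset)
    shuffled = list(all_docs)
    rng.shuffle(shuffled)
    baseline, size = [], 0
    for doc in shuffled:
        if size >= target:
            break
        baseline.append(doc)
        size += len(doc)
    d_prime = distance(baseline, all_docs)
    return d - d_prime


docs = [["apple", "pie"], ["apple", "tart"], ["car", "engine"], ["car", "wheel"]]
score = importance("apple", docs)
```

Under these assumptions, a word whose documents have a distribution no more unusual than that of a random subset of similar size scores near zero, which is the effect the comparison with d' appears intended to achieve.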

4. (currently amended) A word importance calculation method, as claimed in 
Claim 2, wherein: 

[[If]] in case that the number of documents containing said word is larger than a 
prescribed number, a preset number of documents are extracted from the said 
subset of whole documents by random sampling, and the difference between the 
extracted set of documents and said set of whole documents is used instead of the 
difference between the original subset of documents and the set of whole 
documents. 
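
The sampling step of Claim 4 could be sketched as follows. The parameter names threshold (the "prescribed number") and sample_size (the "preset number") are hypothetical, as is the function name cap_subset; the claim does not state concrete values for either number.

```python
import random


def cap_subset(subset_docs, threshold, sample_size, rng=None):
    """Return the subset unchanged if it is small enough; otherwise return
    a random sample of `sample_size` documents drawn without replacement."""
    rng = rng or random.Random(0)
    if len(subset_docs) <= threshold:
        return list(subset_docs)
    # Never ask for more documents than the subset actually holds.
    return rng.sample(list(subset_docs), min(sample_size, len(subset_docs)))


large = [["word", str(i)] for i in range(1000)]
sampled = cap_subset(large, threshold=500, sample_size=100)
```

The sampled set would then take the place of the original subset when the difference from the set of whole documents is calculated, which bounds the cost of the calculation for very frequent words.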

5. (currently amended) A document retrieval interface having a function to 
display on a screen words characterizing a document set, wherein the importance of 
each word occurring in [[the set]] a set of whole documents is calculated using the 
difference between the word distribution in the subset of [[whole documents]] every 
document containing the word and the word distribution in the set of whole 
documents including said subset, and the importance is brought to bear on the 
selection, arrangement or coloring of the words displayed on the screen. 

6. (original) A document retrieval interface having a function to display on a 
screen words characterizing a document set, wherein the importance of each word 
occurring in the document set obtained as a result of retrieval is calculated using the 
difference between the word distribution in the subset of documents out of the 
document set obtained as a result of that retrieval containing that word and the word 
distribution in the document set obtained as a result of that retrieval, and the 
importance is brought to bear on the selection, arrangement or coloring of the words 
displayed on the screen. 



7. (currently amended) A word dictionary construction method by extracting 
important words from a document set in accordance with rules given in advance, 
wherein the importance of each word occurring in a set of whole documents is 
calculated using the difference between a subset of [[whole documents]] every 
document containing the word and the word distribution in the set of whole 
documents including said subset, and words to be extracted are selected on the 
basis of that importance. 



8. - 9. (canceled) 



10. (new) A word importance calculation method according to claim 1, said 
difference being defined by 

D(W) being said subset, 
D(0) being the set of the whole document, 

kj being a frequency of an occurrence of the specified word in D(W), 
Kj being a frequency of an occurrence of the specified word in D(0). 